
12th iThome Ironman Contest

DAY 16
AI & Data

"An Old-Timer in His Sixties Learns AI" series, Part 16

[Ironman 12: Day 16] GPT-3 (3): Nothing Special!


(Just for a laugh: https://youtu.be/fN7ultKGPmU )

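Not everyone in the AI community considers GPT-3 a remarkable achievement. In August of this year, MIT Technology Review published an article titled "GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about" (Note 1). It calls GPT-3 a "bloviator", that is, a pompous windbag, and argues that GPT-3 "has no idea what it's talking about"!

Let's take a look at the authors' point of view.

The authors do not consider GPT-3 an important step toward artificial general intelligence. At first glance it seems able to produce sentences that sound like a real person wrote them. Dig deeper, though, and you find that although these sentences are grammatical and fluent, they lack any understanding of the real world. In other words, you cannot trust what it says! In the authors' own words: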

We doubt it. At first glance, GPT-3 seems to have an impressive ability to produce human-like text. And we don't doubt that it can be used to produce entertaining surrealist fiction; other commercial applications may emerge as well. But accuracy is not its strong point. If you dig deeper, you discover that something’s amiss: although its output is grammatical, and even impressively idiomatic, its comprehension of the world is often seriously off, which means you can never really trust what it says.

The authors point to one important problem with GPT-3: it cannot be trusted.

They tested GPT-3 in several different domains. The first is a biological reasoning test:

Input: You poured yourself a glass of cranberry juice, but then you absentmindedly poured about a teaspoon of grape juice into it. It looks okay. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. So

Output: you drink it. You are now dead.

Authors' comment: GPT-3 seems to think grape juice is poisonous!

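Physical reasoning test:

Input: You are having a small dinner party. You want to serve dinner in the living room. The dining room table is wider than the doorway, so to get it into the living room, you will have to

Output: remove the door. You have a table saw, so you cut the door in half and remove the top half.

Authors' comment: The usual solution is to tip the table on its side and bring it through the doorway at a different angle, or to take the legs off. Nobody would cut a door in half! Besides, a table saw cannot be used to saw a door.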

Social reasoning test:

Input: You are a defense lawyer and you have to go to court today. Getting dressed in the morning, you discover that your suit pants are badly stained. However, your bathing suit is clean and very stylish. In fact, it’s expensive French couture; it was a birthday present from Isabel. You decide that you should wear

Output: the bathing suit to court. You arrive at the courthouse and are met by a bailiff who escorts you to the courtroom.

Authors' comment: If you really wore a bathing suit to court, the bailiff probably would not let you in, and even if you did get in, the judge would hold you in contempt of court.

Object tracking test:

Input: Yesterday I dropped my clothes off at the dry cleaner’s and I have yet to pick them up. Where are my clothes?

Output: I have a lot of clothes.

Authors' comment: GPT-3 dodges the question. It cannot keep track of where the object is.

Psychological reasoning test:

Input: Janet and Penny went to the store to get presents for Jack. Janet said, “I will buy Jack a top.” “Don’t get Jack a top,” says Penny. “He has a top. He will

Output: get a top.” “I will get Jack a top,” said Janet.

Authors' comment: GPT-3 did not pick up on Penny's objection, and it also ignored the fact that Jack already has a top.

Non sequiturs:

Input: At the party, I poured myself a glass of lemonade, but it turned out to be too sour, so I added a little sugar. I didn’t see a spoon handy, so I stirred it with a cigarette. But that turned out to be a bad idea because

Output: it kept falling on the floor. That’s when he decided to start the Cremation Association of North America, which has become a major cremation provider with 145 locations.

Authors' comment: This makes no sense at all!

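Judging from the tests above, GPT-3's reasoning ability really cannot be relied on. The authors point out that these weaknesses were already present in GPT-2 and have not improved in GPT-3. Compared with GPT-2, GPT-3 has more than a hundred times as many parameters (175 billion versus 1.5 billion) and was trained on a much larger dataset. Its language is indeed more fluent and closer to a human's. However, what GPT-3 learns stays at the level of text: it learns the relationships between words, but it cannot learn the deeper concepts behind the words. As a result, its logical reasoning remains weak and its understanding of the world remains shallow.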
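To get a feel for what "learning only the relationships between words" means, here is a minimal sketch in Python. It is a toy bigram model, a deliberately simplistic stand-in and not GPT-3's actual architecture (GPT-3 is a far larger Transformer-based neural network). The function names `train_bigram` and `continue_text` and the tiny corpus are made up purely for illustration.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Record, for each word, every word that follows it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def continue_text(followers, prompt, max_words=8, seed=0):
    """Extend the prompt by repeatedly sampling a word seen after the last word."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(max_words):
        candidates = followers.get(words[-1])
        if not candidates:
            break  # the last word never appeared mid-text in training
        words.append(rng.choice(candidates))
    return " ".join(words)


# Hypothetical toy corpus: the model only ever sees which word follows which.
corpus = (
    "you are thirsty so you drink it . "
    "you drink the poison . you are now dead . "
    "you pour the juice so you drink it ."
)
model = train_bigram(corpus)
print(continue_text(model, "you pour the grape juice so you"))
```

Every continuation is locally plausible, because each word really did follow the previous one somewhere in the corpus. Nothing in the model, however, represents what juice or poison actually is, so it can happily drift from drinking juice to being dead. The real model is enormously more capable, but the authors' criticism is of the same kind: fluency learned from text alone does not amount to understanding.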

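At the end of the article, the authors quote a colleague's very apt description of GPT-3:

GPT is not designed to answer questions or to give correct answers. It is more like an improv performer who has never left the house and whose knowledge all comes from books. To keep the performance flowing, whenever he runs into a question he does not understand, he always finds a way to make something up.

After these past few days of posts about GPT-3, do you AI folks now have a more rounded view of it? As an AI developer, this old-timer believes that whether or not you are optimistic about GPT's future, the technology inside it is absolutely worth studying in depth.

(Note 1: for the full article, see https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/?utm_campaign=Artificial%2BIntelligence%2BWeekly&utm_medium=email&utm_source=Artificial_Intelligence_Weekly_176 )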

